enhance(test-benchmark): use config file for fixed opcode count scenarios by spencer-tb · Pull Request #1790 · ethereum/execution-specs

spencer-tb · 2025-11-14T17:45:48Z

🗒️ Description

This PR adds a CLI tool benchmark_parser to automatically scan benchmark tests and generate a configuration file .fixed_opcode_counts.json for the --fixed-opcode-count feature from #1747.

Key Changes

New CLI tool: uv run benchmark_parser
- Uses Python AST to scan tests/benchmark/ for tests with @pytest.mark.repricing marker
- Extracts opcode patterns from @pytest.mark.parametrize decorators
- Generates .fixed_opcode_counts.json at repo root with opcode counts mapping
- Supports --check mode for CI validation: uv run benchmark_parser --check
Config file format: .fixed_opcode_counts.json
- Gitignored (user-local configuration)
- All patterns default to [1] (1K opcodes)
- Users can customize counts per pattern by manually editing the file, e.g. [1, 10, 100] for 1K, 10K, and 100K opcodes
- Custom counts are preserved when re-running the parser
Help text improvements:
- Added benchmark options to the help output of fill (--fill-help) and execute remote (--execute-remote-help)
- Simplified help text with examples for --gas-benchmark-values and --fixed-opcode-count
Test updates:
- Renamed op parameters to opcode in test_arithmetic.py for consistency

Usage

Generate/update config (first time or after benchmark test changes): uv run benchmark_parser
Customize counts by editing .fixed_opcode_counts.json:

{
  "scenario_configs": {
    "test_codecopy.*": [
      1
    ],
    ...
  }
}
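One way to picture how custom counts survive a re-run, and what a check mode could compare, is the sketch below. It is an assumption-laden illustration, not the tool's implementation: `merge_config`, `is_up_to_date`, and `opcode_totals` are hypothetical names, and dropping patterns that no longer appear in the scan is an assumed behavior. The `[1]` default and the "count × 1K opcodes" reading come from the description above.

```python
DEFAULT_COUNTS = [1]  # per the PR description: every pattern defaults to [1], i.e. 1K opcodes


def merge_config(existing: dict, patterns: list[str]) -> dict:
    """Build a fresh config, reusing any counts the user already customized."""
    old = existing.get("scenario_configs", {})
    merged = {p: old.get(p, list(DEFAULT_COUNTS)) for p in sorted(patterns)}
    return {"scenario_configs": merged}


def is_up_to_date(existing: dict, patterns: list[str]) -> bool:
    """Rough analogue of a check mode: True when regenerating changes nothing."""
    return merge_config(existing, patterns) == existing


def opcode_totals(counts: list[int]) -> list[int]:
    """Translate config counts into opcode totals (each unit is 1K opcodes)."""
    return [count * 1000 for count in counts]


existing = {"scenario_configs": {"test_codecopy.*": [1, 10, 100]}}
merged = merge_config(existing, ["test_codecopy.*", "test_add.*"])
print(merged)
print(opcode_totals(merged["scenario_configs"]["test_codecopy.*"]))
```

Here `test_add.*` is a made-up pattern used only to show a newly discovered test receiving the default while the edited `test_codecopy.*` entry is kept.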

Run with configured opcode counts:

# Fill fixtures (useful for fast one shot checks if count is 1)
uv run fill --fixed-opcode-count --fork Prague -m repricing tests/benchmark

# Execute on remote RPC
uv run execute remote --fixed-opcode-count --fork Prague -m repricing tests/benchmark --rpc-seed-key <key> --rpc-endpoint <url> --chain-id <id>

Fill works correctly. I also tested execute remote on Hoodi with the opcode count set to 1K; the latest txs are here: https://hoodi.etherscan.io/address/0x83fd666bfb2b345f932c3e4e04b6d85e5ed3568d

Future Items

Add CI for fill/execute with --fixed-opcode-count after generating the config file.
Verify --fixed-opcode-count with debug_traceTransaction using execute hive.
Add documentation & framework tests.

🔗 Related Issues or PRs

#1747

✅ Checklist

All: Ran fast tox checks to avoid unnecessary CI fails, see also Code Standards and Enabling Pre-commit Checks:
```
uvx tox -e static
```
All: PR title adheres to the repo standard - it will be used as the squash commit message and should start type(scope):.
All: Considered adding an entry to CHANGELOG.md.
All: Considered updating the online docs in the ./docs/ directory.
All: Set appropriate labels for the changes (only maintainers can apply labels).

LouisTsai-Csie

Thanks a lot for this! I left some suggestions, but I’m happy to discuss further. I’ll share this with Kamil to confirm it aligns with their needs.

tests/benchmark/configs/fixed_opcode_counts.py

packages/testing/src/execution_testing/cli/pytest_commands/plugins/shared/benchmarking.py

codecov · 2025-12-04T17:32:20Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 83.87%. Comparing base (2a6f9ee) to head (b98b267).
⚠️ Report is 18 commits behind head on forks/osaka.

Additional details and impacted files

@@               Coverage Diff               @@
##           forks/osaka    #1790      +/-   ##
===============================================
- Coverage        87.31%   83.87%   -3.45%     
===============================================
  Files              541      402     -139     
  Lines            32832    25101    -7731     
  Branches          3015     2285     -730     
===============================================
- Hits             28668    21053    -7615     
- Misses            3557     3609      +52     
+ Partials           607      439     -168

Flag	Coverage Δ
unittests	`83.87% <ø> (-3.45%)`	⬇️

LouisTsai-Csie

Added some suggestions for the parser! Thanks

tests/benchmark/configs/parser.py

packages/testing/src/execution_testing/cli/pytest_commands/plugins/shared/benchmarking.py

LouisTsai-Csie

Thanks!! Only a small logic adjustment is needed.

Here is an example using this test:

@pytest.mark.repricing(contract_balance=0)
@pytest.mark.parametrize("contract_balance", [0, 1])
def test_selfbalance(
    benchmark_test: BenchmarkTestFiller,
    contract_balance: int,
) -> None:
    """Benchmark SELFBALANCE instruction."""
    benchmark_test(
        code_generator=ExtCallGenerator(
            attack_block=Op.SELFBALANCE,
            contract_balance=contract_balance,
        ),
    )

When running in pure --fixed-opcode-count or --gas-benchmark-values mode, both parameters should be executed (contract_balance = 0 and 1).
This results in 4 tests being run (including both blockchain test and blockchain engine test combinations).

Example run (without repricing marker)
The full benchmark test suite should run, producing 4 tests:

fill -v tests/benchmark/compute/instruction/test_account_query.py::test_selfbalance --gas-benchmark-values 10 -m benchmark --clean

fill -v tests/benchmark/compute/instruction/test_account_query.py::test_selfbalance --fixed-opcode-count 1 -m benchmark --clean

With the repricing marker
Only the cases with contract_balance = 0 should run, so it should produce 2 tests:

fill -v tests/benchmark/compute/instruction/test_account_query.py::test_selfbalance --gas-benchmark-values 10 -m repricing --clean

fill -v tests/benchmark/compute/instruction/test_account_query.py::test_selfbalance --fixed-opcode-count 1 -m repricing --clean

Edit:

Regarding the check: if no marker is present, the test should still run when using the -m benchmark flag, but it should be ignored when using the -m repricing flag.

Taking test_selfbalance as the example again: if you remove the repricing marker, both of these commands should still run:

fill -v tests/benchmark/compute/instruction/test_account_query.py::test_selfbalance --gas-benchmark-values 10 -m benchmark --clean

fill -v tests/benchmark/compute/instruction/test_account_query.py::test_selfbalance --fixed-opcode-count 1 -m benchmark --clean
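The selection rule described in this comment can be summarized as a small pure function. This is only a sketch of the expected behavior, not the plugin code: `keep_case` and its arguments are hypothetical, and it treats the `-m` expression as a plain string for simplicity.

```python
from typing import Any, Optional


def keep_case(
    marker_expr: str,
    repricing_kwargs: Optional[dict[str, Any]],
    params: dict[str, Any],
) -> bool:
    """Decide whether one parametrized case should run.

    marker_expr: the value passed to -m (simplified to a single marker name).
    repricing_kwargs: kwargs of @pytest.mark.repricing, or None if the marker is absent.
    params: the parameter values of this particular case.
    """
    if marker_expr != "repricing":
        return True  # e.g. -m benchmark: every parametrization runs
    if repricing_kwargs is None:
        return False  # no repricing marker: ignored under -m repricing
    # Marker present: keep only the cases that match every marker kwarg.
    return all(params.get(key) == value for key, value in repricing_kwargs.items())


cases = [{"contract_balance": 0}, {"contract_balance": 1}]
marker = {"contract_balance": 0}
print([keep_case("benchmark", marker, case) for case in cases])
print([keep_case("repricing", marker, case) for case in cases])
```

Each kept case is then filled in both fixture formats mentioned above, which is why two kept cases give 4 tests and one kept case gives 2.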

packages/testing/src/execution_testing/cli/pytest_commands/plugins/shared/benchmarking.py

spencer-tb · 2025-12-09T13:11:16Z

Thanks for this! I've tried to address all your comments. Let me know if I missed anything:

1. Fix repricing filter to work with both benchmark options (0ca5109)

pytest_collection_modifyitems now checks for both --gas-benchmark-values and --fixed-opcode-count before applying the -m repricing filter.
Added a test to verify repricing filter works with both options.

2. Allow fixed-opcode-count for all benchmark tests (a18f800)

Removed the has_repricing check in pytest_generate_tests so --fixed-opcode-count works on all benchmark tests, not just repricing marked ones.

3. Warn when config file missing (a20f2b7)

Added a UserWarning when --fixed-opcode-count is provided without a value but .fixed_opcode_counts.json doesn't exist.

4. Update test to match new help text (dff1a46)

Fixed the failing CI test by updating the expected help text string.

5. Remove unnecessary generic_visit call in parser (33fa5ab)

Removed self.generic_visit(node) since we have early returns and don't use nested test functions.
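As an illustration of item 3, a missing-config warning could look roughly like the sketch below. The function name `load_scenario_configs`, the warning text, and the fallback to an empty mapping are assumptions made for this example; the PR only states that a UserWarning is raised when the file does not exist.

```python
import json
import warnings
from pathlib import Path


def load_scenario_configs(config_path: Path) -> dict:
    """Load the pattern-to-counts mapping, warning if the file is absent."""
    if not config_path.exists():
        warnings.warn(
            f"{config_path} not found; run `uv run benchmark_parser` to generate it.",
            UserWarning,
            stacklevel=2,
        )
        return {}  # assumed fallback: no per-pattern configuration
    data = json.loads(config_path.read_text())
    return data.get("scenario_configs", {})


print(load_scenario_configs(Path(".fixed_opcode_counts.json")))
```

Returning an empty mapping keeps the run going while making the missing file visible in the pytest warnings summary.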

LouisTsai-Csie

LGTM! Huge thanks for the effort!

…rios (ethereum#1790)
* enhance(test-benchmark): use config file for fixed opcode count scenarios
* chore(test-benchmark): update help messages for fixed opcode count and gas bench values
* chore(test-benchmark): fix repricing filter to work with both benchmark options
* chore(test-benchmark): allow fixed-opcode-count for all benchmark tests
* chore(test-benchmark): warn when config file missing for fixed-opcode-count
* chore(test-benchmark): update test to match new help text
* chore(test-benchmark): remove unnecessary generic_visit call in parser
* chore(test-benchmark): format test file

spencer-tb added the C-feat (Category: an improvement or new feature), A-test-benchmark (Area: execution_testing.benchmark and tests/benchmark), and P-high labels Nov 15, 2025

LouisTsai-Csie mentioned this pull request Dec 3, 2025

Enhance fixed opcode count workflow for benchmark #1834

Open

LouisTsai-Csie requested changes Dec 4, 2025

spencer-tb force-pushed the enhance/benchmarking/fixed-opcode-count-config branch from 7ca99dc to 7c68d19 Compare December 4, 2025 16:34

LouisTsai-Csie requested changes Dec 5, 2025

spencer-tb force-pushed the enhance/benchmarking/fixed-opcode-count-config branch 2 times, most recently from 5506d25 to 14b6b64 Compare December 5, 2025 19:15

enhance(test-benchmark): use config file for fixed opcode count scenarios

697a7d0

spencer-tb force-pushed the enhance/benchmarking/fixed-opcode-count-config branch from 14b6b64 to 697a7d0 Compare December 5, 2025 19:18

spencer-tb marked this pull request as ready for review December 5, 2025 19:33

chore(test-benchmark): update help messages for fixed opcode count and gas bench values

4427257

spencer-tb mentioned this pull request Dec 5, 2025

enhance(ci): improve benchmark workflows #1853

Merged

5 tasks

LouisTsai-Csie requested changes Dec 8, 2025

spencer-tb added 5 commits December 9, 2025 12:52

chore(test-benchmark): fix repricing filter to work with both benchmark options

0ca5109

chore(test-benchmark): allow fixed-opcode-count for all benchmark tests

a18f800

chore(test-benchmark): warn when config file missing for fixed-opcode-count

a20f2b7

chore(test-benchmark): update test to match new help text

dff1a46

chore(test-benchmark): remove unnecessary generic_visit call in parser

33fa5ab

chore(test-benchmark): format test file

b98b267

spencer-tb mentioned this pull request Dec 9, 2025

feat(test-benchmark): implement opcode count verification #1869

Merged

8 tasks

LouisTsai-Csie mentioned this pull request Dec 9, 2025

Gas Lighting Committee #8, Dec 9, 2025 ethpandaops/gas-lighting-tracker#21

Open

LouisTsai-Csie approved these changes Dec 9, 2025

spencer-tb merged commit d1e7e6b into ethereum:forks/osaka Dec 9, 2025
13 of 14 checks passed

spencer-tb mentioned this pull request Jan 7, 2026

feat(test-benchmark): updates and fixes for fixed opcode count #1985

Merged

5 tasks
